Abstract


In this paper, we describe a proof-of-concept implementation of the probabilistically checkable proof 
of proximity (PCPP) system described by Ben-Sasson and Sudan in [BSS05]. In particular, we implement
a PCPP prover and verifier for Reed-Solomon codes; the prover converts an evaluation of a polynomial 
on a linear set into a valid PCPP, while the verifier queries the evaluation and the PCPP to check that 
the evaluation is close to a Reed-Solomon codeword. We prove tight bounds on the various parameters 
associated with the prover and verifier and describe some interesting programmatic issues that arise 
during their implementation. 


1 Introduction 


A probabilistically checkable proof (PCP) system specifies a format for writing proofs that can be verified 
efficiently by querying only a few bits. Formally, a PCP system consists of an input string, a source of 
random bits, a proof string, and a probabilistic polynomial-time Turing machine called the verifier. The 
verifier has random access to the proof; given an address of a location in the proof, the verifier can query 
that location in the proof as a single oracle operation. A PCP verifier V with perfect completeness and 
soundness s(n) for the language L satisfies the following conditions: 


• For every input $x$ in $L$, there is a proof $\Pi$ such that $V$ accepts with probability 1.
• For every input $x$ not in $L$ and for every proof $\Pi$, $V$ accepts with probability less than $s(n)$.


Furthermore, a language $L$ is said to be in $PCP[r(n), q(n)]$ if there is a PCP verifier for $L$ that on each
input of size $n$ uses at most $r(n)$ random bits and queries at most $q(n)$ bits of the proof. The celebrated
PCP Theorem states that for any language in NP, there exists a PCP verifier with soundness 1/2 that uses
$O(\log n)$ random bits and queries $O(1)$ bits of the proof. Hence, the size of the proof needed by the verifier
is $2^{O(\log n)} = \mathrm{poly}(n)$, polynomially larger than the size of the NP-witness.

Subsequently, much work has been done in trying to reduce the length of the proof and to make its 
constructions simpler. The length of the proof is relevant to applications of PCP theory in cryptography 
and to constructions of locally testable codes (LTCs). Moreover, there is the possibility that a PCP system 
with short proof size could form the basis for a semantic analog of error-correcting codes. Simplifying 
the proof construction is also important for this reason. Some progress toward these goals was made in
[BSS05] where Ben-Sasson and Sudan showed that there exist probabilistically checkable proofs for verifying 
satisfiability of circuits of size $n$ of length $n \cdot \mathrm{poly}(\log n)$, with the verifier querying $\mathrm{poly}(\log n)$ bits of the
proof. Moreover, the construction of the proof is significantly simpler than in previous PCP constructions. 
Their Theorem 1 states: 


Theorem 1 ([BSS05], Theorem 1): SAT has a PCP verifier that on inputs of length $n$ tosses
$\log(n \cdot \mathrm{poly}(\log n))$ coins, makes $\mathrm{poly}(\log n)$ queries to a proof oracle of length $n \cdot \mathrm{poly}(\log n)$, runs in
time $n \cdot \mathrm{poly}(\log n)$ and has perfect completeness and soundness at most 1/2.


This PCP construction involves the construction of probabilistically checkable proofs of proximity 
(PCPPs) for Reed-Solomon codes. PCPPs provide an even stronger restriction on the verifier’s computation, 
compared to the standard PCP model. Whereas a PCP verifier has unrestricted access to the input string 
but is restricted to making only a few queries to the proof, a PCPP verifier has restricted access to both the input
and the proof. Formally: 


Definition 1 (PCPP) A set $C \subseteq \Sigma^n$ has a probabilistically checkable proof of proximity over
alphabet $\Sigma$ of length $\ell(n)$ with query complexity $q(n)$, perfect completeness and soundness $s(\cdot, n)$ if
there exists a verifier $V$ with oracle access to a pair $(x, \pi) \in \Sigma^{n + \ell(n)}$ such that $V$ tosses $r(n)$ coins,
makes $q(n)$ queries into $(x, \pi)$ and accepts or rejects as follows:


- If $x \in C$, then $\exists \pi \in \Sigma^{\ell(n)}$ such that the verifier accepts $(x, \pi)$ with probability 1.
- If $\Delta(x, C) > \delta$, then $\forall \pi \in \Sigma^{\ell(n)}$, the verifier rejects $(x, \pi)$ with probability at least $s(\delta, n)$.


[BSS05] provides efficient PCPPs for Reed-Solomon codes, which are defined next: 


Definition 2 (RS-Codes) The Reed-Solomon code of degree $d$ over a field $F$ evaluated at $S \subseteq F$
is defined as $RS(F, S, d) = \{(P(z))_{z \in S} : P(z) = \sum_{i=0}^{d-1} a_i z^i, a_i \in F\}$, where $(P(z))_{z \in S}$, the evaluation
table of $P$ over $S$, is the sequence $(P(s) : s \in S)$ and $S$ has some canonical ordering to make it a
sequence.


The primary result in [BSS05] regarding PCPPs for RS-codes that we are concerned with is the following: 


Theorem 2 ([BSS05], Theorem 4) There exists a universal constant $c > 1$ such that for every
field $F$ of characteristic two, every linear $S \subseteq F$ with $|S| = n$ and every $d < n$, the Reed-Solomon
code $RS(F, S, d)$ has a PCPP over alphabet $F$ with proof length $\ell(n) < n \log^c n$, randomness $r(n) <
\log n + c \log\log n$, query complexity $q(n) = O(1)$, and soundness $s(\delta, n) > \delta / \log^c n$.


In this paper, we describe an actual implementation of this PCPP system for Reed-Solomon codes. 
Specifically, the following two programs are implemented: 


1. A prover that receives as input a description of the field $F = GF(2^\ell)$, a basis $(b_1, \ldots, b_k)$ for $L \subseteq F$,
a degree parameter $d$ and a polynomial $P : L \to F$ of degree less than $d$, and that outputs a PCPP
which is supposed to prove that $(P(z))_{z \in L}$ is in $RS(F, L, d)$.


2. A verifier that receives as input a description of the field $F = GF(2^\ell)$, a basis $(b_1, \ldots, b_k)$ for $L \subseteq F$, a
degree parameter $d$ and oracle access to a purported RS-codeword $p : L \to F$ and its purported PCPP
$\pi$, and that accepts or rejects based on the proximity of $p$ to $RS(F, L, d)$.


In the following, we detail these implementations and provide some tight bounds on the various
complexity parameters associated with the PCPP system. These results establish that the constants associated
with the PCPP size are not at all large and so could perhaps motivate the use of probabilistically checkable
proofs in real life as analogs to error-correcting codes.


2 Implementation of the PCPP system 


The most basic operations in constructing and verifying the probabilistically checkable proofs of proximity 
described in [BSS05] are addition and multiplication in fields of characteristic two, extension fields of GF(2). 
To do these operations efficiently while maintaining a proper programmatic abstraction, I used the excellent 
C++ library NTL, developed by Victor Shoup [Sho]. NTL is a high-quality and portable C++ library 
providing an efficient programmatic interface for computations over finite fields. Our PCPP prover and 
verifier programs are implemented as dynamically-linked C++ libraries with dependencies on the base NTL 
library. Thus, users of our implementation can link to our prover and verifier modules to create a valid 
PCPP and verify provided PCPPs respectively. 

NTL represents elements of the field $GF(2^\ell)$ as polynomials in $GF(2)[x]$ modulo an irreducible polynomial
$P$ of degree $\ell$. Hence, in the following, I will view field elements as vectors from the (additive) vector space
$GF(2)^\ell$. For the prover to provide a proof acceptable to the verifier, it must use the same irreducible
polynomial $P$ as the verifier. Also, both must sequence the field elements in the same order, and both must
use the same basis elements for any subspaces of $F$ that are considered.


2.1 Evaluation and Interpolation of Polynomials
The following two problems need to be solved repeatedly while constructing and verifying our PCPPs:


• (Evaluation) Given a finite field $F$ of characteristic 2, coefficients $c_0, \ldots, c_{n-1} \in F$ and linearly
independent elements $e_1, \ldots, e_k \in F$ with $n = 2^k$, compute the set $\{(a, p(a)) \mid a \in \mathrm{span}(e_1, \ldots, e_k)\}$
where $p(x) = \sum_{i=0}^{n-1} c_i x^i$.


• (Interpolation) Given a finite field $F$ of characteristic 2, linearly independent elements $e_1, \ldots, e_k \in F$
and the set $\{(a, p_a) \mid a \in \mathrm{span}(e_1, \ldots, e_k)\}$, compute coefficients $c_0, \ldots, c_{n-1} \in F$ such that
$p_a = \sum_{i=0}^{n-1} c_i a^i$ for all $a \in \mathrm{span}(e_1, \ldots, e_k)$.


Both can be achieved with $O(n \log^2 n)$ field operations¹ using a Fast Fourier Transform method. Here, I
will describe the solution to the interpolation problem; the solution to the evaluation problem is very similar 
although not identical. The key ideas behind the interpolation algorithm are in the lemmas below: 


Lemma 1 Given $e_1, \ldots, e_k \in F$, there exists a monic quadratic $q(x)$ such that for every $a \in
\mathrm{span}(e_1, \ldots, e_{k-1})$, $q(a) = q(a + e_k)$. Also, there exist vectors $e'_1, \ldots, e'_{k-1} \in F$ such that for all
$a \in \mathrm{span}(e_1, \ldots, e_k)$, $q(a) \in \mathrm{span}(e'_1, \ldots, e'_{k-1})$. Further, $q$ and $e'_1, \ldots, e'_{k-1}$ can be computed in time
$O(k)$.


Proof Let $q(x) = x^2 - e_k \cdot x$ and let $e'_i = q(e_i)$ for $1 \le i \le k-1$. Note that $q(x + y) = q(x) + q(y)$
since we are in a field of characteristic 2. So, because $q(e_k) = 0$, the first assertion is true. The second
assertion holds since if $a = \sum_{i=1}^{k} \lambda_i e_i$ with $\lambda_i \in GF(2)$, then $q(a) = \sum_{i=1}^{k} q(\lambda_i e_i) = \sum_{i=1}^{k-1} \lambda_i q(e_i)$.
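Lemma 1 can be checked numerically on a small field. The sketch below is illustrative only: it assumes GF(2^8) with the modulus x^8 + x^4 + x^3 + x + 1 instead of NTL's GF2E, and the helper names (gf_mul, q_eval, span_of, lemma1_holds) are hypothetical and not part of our implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

using Elem = std::uint8_t;  // element of GF(2^8), an assumed toy field

// Multiplication modulo x^8 + x^4 + x^3 + x + 1 (assumed irreducible polynomial).
Elem gf_mul(Elem a, Elem b) {
    Elem r = 0;
    while (b) {
        if (b & 1) r ^= a;
        bool carry = (a & 0x80) != 0;
        a = static_cast<Elem>(a << 1);
        if (carry) a ^= 0x1B;
        b >>= 1;
    }
    return r;
}

// q(x) = x^2 + e_k * x; subtraction is XOR in characteristic 2.
Elem q_eval(Elem x, Elem ek) {
    return static_cast<Elem>(gf_mul(x, x) ^ gf_mul(ek, x));
}

// All 2^k elements of the GF(2)-span of the given basis.
std::vector<Elem> span_of(const std::vector<Elem>& basis) {
    std::vector<Elem> out{0};
    for (Elem b : basis) {
        std::size_t m = out.size();
        for (std::size_t i = 0; i < m; ++i)
            out.push_back(static_cast<Elem>(out[i] ^ b));
    }
    return out;
}

// Checks both assertions of Lemma 1 for a linearly independent basis e_1..e_k.
bool lemma1_holds(const std::vector<Elem>& e) {
    Elem ek = e.back();
    std::vector<Elem> eprime;                       // e'_i = q(e_i), i < k
    for (std::size_t i = 0; i + 1 < e.size(); ++i)
        eprime.push_back(q_eval(e[i], ek));
    std::vector<Elem> sp = span_of(eprime);
    std::set<Elem> target(sp.begin(), sp.end());
    for (Elem a : span_of(e)) {
        Elem shifted = static_cast<Elem>(a ^ ek);
        if (q_eval(a, ek) != q_eval(shifted, ek)) return false;  // q(a) = q(a + e_k)
        if (target.count(q_eval(a, ek)) == 0) return false;      // q(a) in span(e')
    }
    return true;
}
```

Calling lemma1_holds({1, 2, 4, 8}) exercises a 16-element subspace; the check passes for any basis because q is additive in characteristic 2.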


Lemma 2 Given the set $\{(a, p_a) \mid a \in \mathrm{span}(e_1, \ldots, e_k)\}$ and the monic degree 2 polynomial $q$ and the
elements $\{e'_i\}_{i=1}^{k-1}$ from Lemma 1, there exist sets $\{(a', p^0_{a'}) \mid a' \in \mathrm{span}(e'_1, \ldots, e'_{k-1})\}$ and $\{(a', p^1_{a'}) \mid a' \in
\mathrm{span}(e'_1, \ldots, e'_{k-1})\}$ such that $p_a = p^0_{q(a)} + a \cdot p^1_{q(a)}$ for all $a \in \mathrm{span}(e_1, \ldots, e_k)$. Moreover, the two sets
can be computed in time $O(n)$.


Proof Note that from the properties of $q$ in Lemma 1, we want the two sets to be such that for all $a \in
\mathrm{span}(e_1, \ldots, e_k)$, $p_a = p^0_{q(a)} + a \cdot p^1_{q(a)}$ and $p_{a + e_k} = p^0_{q(a)} + (a + e_k) \cdot p^1_{q(a)}$. So, $p^1_{q(a)} = e_k^{-1} \cdot (p_{a + e_k} - p_a)$.
Also, then, $p^0_{q(a)} = p_a - a \cdot p^1_{q(a)} = p_a - a \cdot e_k^{-1} \cdot (p_{a + e_k} - p_a)$. Assuming constant-time access to $p_a$, these
calculations can be done for all $a' = q(a) \in \mathrm{span}(e'_1, \ldots, e'_{k-1})$ in time $O(n)$.


¹ Field operations take $O(\log|F|)$ bit operations and will be taken to have unit time cost.


Lemma 3 Given coefficients of two polynomials $p^0(x)$ and $p^1(x)$ of degree less than $n/2$ and any
monic degree 2 polynomial $q(x)$, then there exists a polynomial $p(x)$ of degree less than $n$ such that
$p(x) = p^0(q(x)) + x \cdot p^1(q(x))$. Moreover, the coefficients of $p$ can be computed in time $O(n \log n)$.


Proof The existence statement is clear. We just have to give an efficient algorithm to find the
coefficients of $p(x)$. First of all, write $p^0(x) = b^0(x) + x^{n/4} a^0(x)$ and $p^1(x) = b^1(x) + x^{n/4} a^1(x)$, where
$a^0$, $a^1$, $b^0$ and $b^1$ are polynomials of degree less than $n/4$. Recursively, we can find the coefficients of
the polynomials $a(x)$ and $b(x)$, where $a(x) = a^0(q(x)) + x \cdot a^1(q(x))$ and $b(x) = b^0(q(x)) + x \cdot b^1(q(x))$;
$a(x)$ and $b(x)$ have degrees less than $n/2$. Now, $p(x) = b(x) + q(x)^{n/4} \cdot a(x)$. Since $n$ is a power of 2,
if $q(x) = x^2 + cx + d$, then $q(x)^{n/4} = x^{n/2} + c^{n/4} x^{n/4} + d^{n/4} = x^{n/2} + c' x^{n/4} + d'$. Writing $a(x) =
\sum_{i=0}^{n/2-1} \alpha_i x^i$ and $b(x) = \sum_{i=0}^{n/2-1} \beta_i x^i$, we can see that

\[
p(x) = \sum_{i=0}^{n/4-1} (d'\alpha_i + \beta_i) x^i + \sum_{i=n/4}^{n/2-1} (d'\alpha_i + c'\alpha_{i-n/4} + \beta_i) x^i + \sum_{i=n/2}^{3n/4-1} (c'\alpha_{i-n/4} + \alpha_{i-n/2}) x^i + \sum_{i=3n/4}^{n-1} \alpha_{i-n/2} x^i.
\]

Thus, we can get the coefficients of $p(x)$ from the coefficients of $a(x)$ and $b(x)$ in $O(n)$ time, and so the total
time for the recursion is $O(n \log n)$ as claimed.
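The linear-time combination step in the proof of Lemma 3 can be sketched as follows. This is an illustration under assumptions, not our NTL code: it works in GF(2^8) with the modulus x^8 + x^4 + x^3 + x + 1, and the names combine_fast and combine_direct are hypothetical. combine_fast applies the four-range coefficient formula, while combine_direct multiplies out $q(x)^{n/4} \cdot a(x)$ naively so that the two can be compared.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Elem = std::uint8_t;           // element of GF(2^8), an assumed toy field
using Poly = std::vector<Elem>;      // coefficients, lowest degree first

// Multiplication modulo x^8 + x^4 + x^3 + x + 1 (assumed irreducible polynomial).
Elem gf_mul(Elem a, Elem b) {
    Elem r = 0;
    while (b) {
        if (b & 1) r ^= a;
        bool carry = (a & 0x80) != 0;
        a = static_cast<Elem>(a << 1);
        if (carry) a ^= 0x1B;
        b >>= 1;
    }
    return r;
}

// Schoolbook polynomial multiplication, used only for the reference result.
Poly poly_mul(const Poly& f, const Poly& g) {
    Poly r(f.size() + g.size() - 1, 0);
    for (std::size_t i = 0; i < f.size(); ++i)
        for (std::size_t j = 0; j < g.size(); ++j)
            r[i + j] ^= gf_mul(f[i], g[j]);
    return r;
}

// Reference: p = b + q^(n/4) * a with q(x) = x^2 + c*x + d, multiplied out directly.
// a and b each hold n/2 coefficients, with n a power of two and n >= 4.
Poly combine_direct(const Poly& a, const Poly& b, Elem c, Elem d) {
    std::size_t quarter = a.size() / 2;
    Poly q = {d, c, 1}, qpow = {1};
    for (std::size_t t = 0; t < quarter; ++t) qpow = poly_mul(qpow, q);
    Poly p = poly_mul(qpow, a);                     // n coefficients
    for (std::size_t i = 0; i < b.size(); ++i) p[i] ^= b[i];
    return p;
}

// O(n) combination using q^(n/4) = x^(n/2) + c' x^(n/4) + d'.
Poly combine_fast(const Poly& a, const Poly& b, Elem c, Elem d) {
    std::size_t half = a.size(), quarter = half / 2;
    Elem c1 = c, d1 = d;                            // become c' and d' by repeated squaring
    for (std::size_t s = 1; s < quarter; s *= 2) {
        c1 = gf_mul(c1, c1);
        d1 = gf_mul(d1, d1);
    }
    Poly p(2 * half, 0);
    for (std::size_t i = 0; i < 2 * half; ++i) {
        if (i < half) p[i] ^= gf_mul(d1, a[i]) ^ b[i];
        if (i >= quarter && i < half + quarter) p[i] ^= gf_mul(c1, a[i - quarter]);
        if (i >= half) p[i] ^= a[i - half];
    }
    return p;
}
```

Both routines should agree on any input of the stated shape; the fast one touches each output coefficient a constant number of times.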


Given these lemmas, the interpolation algorithm follows: 


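InvFFT-Additive($e_1, \ldots, e_k$, $\{(a, p_a) \mid a \in \mathrm{span}(e_1, \ldots, e_k)\}$)

1. Compute $q(x)$, $e'_1, \ldots, e'_{k-1}$ as by Lemma 1.
2. Compute $\{(a', p^0_{a'}) \mid a' \in \mathrm{span}(e'_1, \ldots, e'_{k-1})\}$ and $\{(a', p^1_{a'}) \mid a' \in \mathrm{span}(e'_1, \ldots, e'_{k-1})\}$ as by Lemma 2.
3. Compute $p^0(x)$ = InvFFT-Additive($e'_1, \ldots, e'_{k-1}$, $\{(a', p^0_{a'}) \mid a' \in \mathrm{span}(e'_1, \ldots, e'_{k-1})\}$).
4. Compute $p^1(x)$ = InvFFT-Additive($e'_1, \ldots, e'_{k-1}$, $\{(a', p^1_{a'}) \mid a' \in \mathrm{span}(e'_1, \ldots, e'_{k-1})\}$).
5. Compute $p(x)$ from $p^0(x)$ and $p^1(x)$ as by Lemma 3.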
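The five steps above can be sketched end to end on a toy field. The code below is a minimal illustration under stated assumptions: it uses GF(2^8) with the modulus x^8 + x^4 + x^3 + x + 1 rather than NTL types, all names are hypothetical, and step 5 uses a simple quadratic-time Horner composition instead of the $O(n \log n)$ procedure of Lemma 3.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

using Elem  = std::uint8_t;               // element of GF(2^8), an assumed toy field
using Poly  = std::vector<Elem>;          // coefficients, lowest degree first
using Table = std::map<Elem, Elem>;       // a -> p_a

// Multiplication modulo x^8 + x^4 + x^3 + x + 1 (assumed irreducible polynomial).
Elem gf_mul(Elem a, Elem b) {
    Elem r = 0;
    while (b) {
        if (b & 1) r ^= a;
        bool carry = (a & 0x80) != 0;
        a = static_cast<Elem>(a << 1);
        if (carry) a ^= 0x1B;
        b >>= 1;
    }
    return r;
}

Elem gf_inv(Elem a) {                     // a^254 is the inverse of a nonzero a
    Elem r = 1;
    for (int i = 0; i < 254; ++i) r = gf_mul(r, a);
    return r;
}

Elem q_eval(Elem x, Elem ek) {            // q(x) = x^2 + e_k * x
    return static_cast<Elem>(gf_mul(x, x) ^ gf_mul(ek, x));
}

Elem poly_eval(const Poly& p, Elem x) {   // Horner's rule
    Elem r = 0;
    for (std::size_t i = p.size(); i-- > 0;)
        r = static_cast<Elem>(gf_mul(r, x) ^ p[i]);
    return r;
}

std::vector<Elem> span_of(const std::vector<Elem>& basis) {
    std::vector<Elem> out{0};
    for (Elem b : basis) {
        std::size_t m = out.size();
        for (std::size_t i = 0; i < m; ++i)
            out.push_back(static_cast<Elem>(out[i] ^ b));
    }
    return out;
}

// Returns f(q(x)) by Horner's rule on polynomials (naive stand-in for Lemma 3).
Poly compose_with_q(const Poly& f, Elem ek) {
    Poly r;
    for (std::size_t i = f.size(); i-- > 0;) {
        Poly next(r.size() + 2, 0);
        for (std::size_t j = 0; j < r.size(); ++j) {
            next[j + 1] ^= gf_mul(r[j], ek);   // e_k * x * r(x)
            next[j + 2] ^= r[j];               // x^2 * r(x)
        }
        next[0] ^= f[i];
        r = next;
    }
    return r;
}

// Interpolates the polynomial of degree < 2^k from its values on span(e).
Poly interpolate(const std::vector<Elem>& e, const Table& t) {
    if (e.empty()) return Poly{t.at(0)};                       // constant polynomial
    Elem ek = e.back(), ek_inv = gf_inv(ek);
    std::vector<Elem> rest(e.begin(), e.end() - 1), eprime;
    for (Elem ei : rest) eprime.push_back(q_eval(ei, ek));     // step 1 (Lemma 1)
    Table t0, t1;
    for (Elem a : span_of(rest)) {                             // step 2 (Lemma 2)
        Elem diff = static_cast<Elem>(t.at(static_cast<Elem>(a ^ ek)) ^ t.at(a));
        Elem d1 = gf_mul(ek_inv, diff);
        t1[q_eval(a, ek)] = d1;
        t0[q_eval(a, ek)] = static_cast<Elem>(t.at(a) ^ gf_mul(a, d1));
    }
    Poly A = compose_with_q(interpolate(eprime, t0), ek);      // steps 3 and 5
    Poly B = compose_with_q(interpolate(eprime, t1), ek);      // steps 4 and 5
    Poly p = A;                                                // p = A + x * B
    for (std::size_t j = 0; j + 1 < B.size(); ++j) p[j + 1] ^= B[j];
    return p;
}
```

A round trip (evaluate a known polynomial of degree less than $2^k$ on the span, then interpolate) should return the original coefficient vector.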


The running time for the algorithm is $O(n \log^2 n)$ because each recursion halves the span of the basis
elements. During implementation, a choice must be made as to the data structure to be used in storing the
evaluation table of a polynomial. Although in the proof of Lemma 2 we assumed that we need constant
time to retrieve $p_a$ given $a$, our implementation uses an associative data container, based on a red-black tree,
which has $O(\log n)$ access time. It can be checked that this does not affect² the asymptotic running time
for the interpolation and evaluation algorithms. (The choice to use a logarithmic-time container instead of
a constant-time container was made merely for convenience; the C++ Standard Template Library
provides the map data type, while there is no corresponding type for a hash table.)

The C++ data structure declarations and function signatures associated with evaluation and interpolation
of polynomials are shown in Listing 1. The code listing shows the two most important NTL types that
are used in the PCPP implementation. GF2E is the type of an element in an extension field of GF(2), and
GF2EX is the type of a polynomial with coefficients of type GF2E. Before its first use, GF2E needs to be
initialized with an irreducible polynomial in $GF(2)[x]$ to specify the extension of GF(2). More details regarding
the NTL programmatic interface to finite field computations can be found at http://www.shoup.net.


Listing 1: Evaluating and interpolating polynomials on fields of characteristic two


/** Evaluation table of a function f on elements of a field of characteristic 2.
 */
struct eval_table {
    // Stores pairs <x,f(x)>
    // (ltGF2E is the comparison operator on field elements)
    map<GF2E,GF2E,ltGF2E> evalmap;

    // Given x, return f(x), assuming <x,f(x)> is in evalmap.
    // Running time: O(log n)
    GF2E query(const GF2E& x) const;

    // Store the pair <x,y>
    // Running time: O(log n)
    void insert(const GF2E& x, const GF2E& y);

    // Clear the evaluation table
    // Running time: O(n), since every stored pair is destroyed
    void clear();
};

/** Store in <table> the evaluation of the polynomial <poly> on the set of
 *  n=2^k field elements spanned by the k elements in <bases>.
 *  Running time: O(n (log n)^2)
 */
void eval_poly(eval_table& table, const GF2EX& poly, const vec_GF2E& bases);

/** Make <poly> the polynomial interpolated from <table>, the
 *  evaluation table of a function at each element spanned by <bases>.
 *  Running time: O(n (log n)^2)
 */
void interpolate_poly(GF2EX& poly, const eval_table& table, const vec_GF2E& bases);


² Comparison of two field elements in traversing the red-black tree takes $O(\log|F|)$ time, the same as that for any other field
operation. As before, we take field operations to have unit time cost.


2.2 The Prover 


In this section, I will detail the implementation of the PCPP prover, the program that, given a polynomial
over a field $F$ of degree less than $d$ and a subspace $L \subseteq F$, constructs a valid probabilistically checkable
proof of proximity that shows that the polynomial's evaluation table over $L$ is in $RS(F, L, d)$. The algorithms
that appear in this section and the next are taken from [BSS04], the full version of the conference paper by 
Ben-Sasson and Sudan. 

Throughout this paper, we will only consider the case when d is fixed to be |S|/8. As is shown in [BSS04], 
the more general case can be reduced to a sequence of these special PCPPs. It is convenient to think of 
the proof, not as a string of bits, but as an oracle that can be queried; the advantage of this viewpoint will 
become very apparent when we describe the verifier. The basic idea of the PCPP construction is that we
convert a univariate polynomial of degree less than $n/8$ into a bivariate polynomial of degree less than $\sqrt{n}$ in
each variable and then we invoke the Polishchuk-Spielman analysis from [PS94] to reduce testing of bivariate
polynomials to testing of univariate polynomials of approximately the same degree. To describe the proof
more precisely, we will introduce the same notation as that used in [BSS04]. Throughout, assume that we
are given a specific basis $(b_1, \ldots, b_k)$ for a linear subspace $L$ of the field and that $n = 2^k = |L|$. Define
the following:


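• $L_0 = \mathrm{span}(b_1, \ldots, b_{\lfloor k/2 \rfloor})$
• $L'_0 = \mathrm{span}(b_1, \ldots, b_{\lfloor k/2 \rfloor + 2})$
• $L'_1 = \mathrm{span}(b_{\lfloor k/2 \rfloor + 1}, \ldots, b_k)$
• $q(x) = \prod_{\alpha \in L_0} (x - \alpha)$
• $L_1 = \mathrm{span}(q(b_{\lfloor k/2 \rfloor + 1}), \ldots, q(b_k))$
• $A_\beta = \{\beta + \alpha \mid \alpha \in L_0\}$, the affine shift of $L_0$ by $\beta$
• For $\beta \in L'_1$, $L_\beta = \mathrm{span}(L'_0, b_{\lfloor k/2 \rfloor + 3})$ if $\beta \in \mathrm{span}(b_{\lfloor k/2 \rfloor + 1}, b_{\lfloor k/2 \rfloor + 2})$, and $L_\beta = \mathrm{span}(L'_0, \beta)$ otherwise
• $T = \{(\gamma, q(\gamma)) \mid \gamma \in L\}$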


Next, we make a few observations that the reader can easily verify to follow directly from the above
definitions. Firstly, $q(x)$ is a GF(2)-linear map with $L_0$ as its kernel (see Proposition 8 in [BSS04]). Secondly, for
all $\beta \in L'_1$, $|L_\beta| = 2|L'_0| = 8|L_0|$ from the definition of $L_\beta$. Thirdly, it is clear that for all $\beta \in L'_1$, $L_\beta$ is a
linear set while $A_\beta \subseteq L_\beta$ is not linear unless $\beta = 0$. Finally, note that

\[
T = \bigcup_{\beta \in L'_1} A_\beta \times \{q(\beta)\}
\]

which follows from the fact that $q$ is a linear transformation with kernel $L_0$.
Using the above notation, the structure of the Reed-Solomon PCP of proximity oracle is:


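Definition 3 ([BSS04], Definition 4) The proof oracle for a codeword of the RS-code $RS(GF(2^\ell), L, |L|/8)$
is defined by induction on $k = \dim(L)$. If $k \le 6$, then it is empty. Otherwise, the proof is a pair
$\pi = \{f, \Pi\}$ where $f$ is a partial bivariate function over partial domain $S \subseteq GF(2^\ell) \times GF(2^\ell)$ and $\Pi$ is
a sequence of PCPPs for RS-codes over smaller linear spaces.


Partial domain S: Let $S_\beta = L_\beta \times \{q(\beta)\}$ and let $T = \{(\gamma, q(\gamma)) \mid \gamma \in L\}$. Then

\[
S = \Big( \bigcup_{\beta \in L'_1} S_\beta \Big) - T = \bigcup_{\beta \in L'_1} (L_\beta - A_\beta) \times \{q(\beta)\}
\]

Auxiliary proofs $\Pi$: For each $\beta \in L'_1$ and $\hat\beta = q(\beta) \in L_1$, $\Pi$ has one PCPP for an RS codeword
over $L_\beta$ of degree $|L_\beta|/8$, denoted $\pi^{\leftrightarrow}_{\hat\beta}$. For each $\alpha \in L'_0$, $\Pi$ includes a PCPP for an RS codeword
over $L_1$ of degree $|L_1|/8$, denoted $\pi^{\updownarrow}_{\alpha}$. Formally,

\[
\Pi = \{\pi^{\leftrightarrow}_{\hat\beta} \mid \hat\beta \in L_1\} \cup \{\pi^{\updownarrow}_{\alpha} \mid \alpha \in L'_0\}
\]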


The C++ declaration of the PCPP object, shown in Listing 2, reflects the recursive structure of the proof 
described above. 


Listing 2: Declaration of the PCPP data type


/** Analog of eval_table for a bivariate polynomial
 */
struct biv_eval_table {
    map<GF2E,eval_table,ltGF2E> evalmap;

    GF2E query(const GF2E& x, const GF2E& y) const;
    void insert(const eval_table& xvals, const GF2E& y);
    void clear();
};

/** Representation of a PCPP oracle for RS-codes
 */
struct poly_oracle {
    // Evaluation of f on S
    biv_eval_table eval;

    // Pointers to the auxiliary proofs Pi
    vector<poly_oracle*> proof;

    // Pointer to additional PCPPs
    poly_oracle* next;
};


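Now, having specified the form of a correct PCPP in Definition 3, we need to specify its contents, the
bivariate polynomial $f$ and the auxiliary proofs $\Pi$.


• Construction of f: Given the polynomials $p$ and $q$, construct the unique bivariate polynomial
$Q(x, y)$ with $\deg_x(Q) < \deg(q)$ and $\deg_y(Q) \le \lfloor \deg(p)/\deg(q) \rfloor$ such that $p(x) = Q(x, q(x))$ for all
$x \in L$. That such a $Q$ exists and is unique is given by Proposition 7 in [BSS04], and the algorithm to
compute it is discussed below in 2.2.1. In our case, $p$ is of degree $n/8$ while $q$ is roughly of degree $\sqrt{n}$;
so, $Q$ is roughly of degree $\sqrt{n}$ in $x$ and $\sqrt{n}/8$ in $y$. Now, define $f(\alpha, \hat\beta) = Q(\alpha, \hat\beta)$ for all $(\alpha, \hat\beta) \in S$.
This is the bivariate function whose evaluation table over $S$ is provided in the PCPP.


• Construction of $\Pi$: Denote by $\hat{p} : T \to F$ the bivariate function defined by $\hat{p}(x, q(x)) = p(x)$
for all $x \in L$³. Then let $\hat{f}$ be the function that agrees with $f$ on $S$ and $\hat{p}$ on $T$. Also define
$\hat{f}|^{\leftrightarrow}_{\hat\beta} : \{\alpha \mid (\alpha, \hat\beta) \in S \cup T\} \to F$ as $\hat{f}|^{\leftrightarrow}_{\hat\beta}(\alpha) = \hat{f}(\alpha, \hat\beta)$. Similarly, define $\hat{f}|^{\updownarrow}_{\alpha} : \{\hat\beta \mid (\alpha, \hat\beta) \in S \cup T\} \to F$
as $\hat{f}|^{\updownarrow}_{\alpha}(\hat\beta) = \hat{f}(\alpha, \hat\beta)$. It is fairly easy to verify (see Proposition 10 in [BSS04]) that for $\beta \in L'_1$ and
$\hat\beta = q(\beta)$, $\{\alpha \mid (\alpha, \hat\beta) \in S \cup T\} = L_\beta$ and that for $\alpha \in L'_0$, $\{\hat\beta \mid (\alpha, \hat\beta) \in S \cup T\} = L_1$. Then for $\beta \in L'_1$
and $\hat\beta = q(\beta)$, $\pi^{\leftrightarrow}_{\hat\beta}$ is the PCPP proving that $\hat{f}|^{\leftrightarrow}_{\hat\beta}$ is a codeword in $RS(F, L_\beta, |L_\beta|/8)$, and for $\alpha \in L'_0$,
$\pi^{\updownarrow}_{\alpha}$ is the PCPP proving that $\hat{f}|^{\updownarrow}_{\alpha}$ is a codeword in $RS(F, L_1, |L_1|/8)$.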


This same description in C++ code is given in Listing 3. 


Listing 3: Construction of Reed-Solomon PCPPs


// d = |L|/8
void ReedSolomon_PCPP(poly_oracle& pcpp, const GF2EX& poly, const vec_GF2E& L_bases) {
    vec_GF2E L00_bases, L10_bases, L0_bases, L1_bases, Lbeta_bases;
    long k = L_bases.length(), i, j;
    GF2EX q, frow, fcol;
    vec_GF2EX f;
    vec_GF2E L0_span, L10_span, Lbeta_span, L1_span;
    GF2E beta0, beta, tmp;
    eval_table coleval, roweval;
    biv_eval_table bioracle;

    if (L_bases.length() > 6) { // 6 because floor(k/2)+3<k for k>=7
        // get the bases for L_0, L'_0 and L'_1
        get_L00_bases(L00_bases, L_bases);
        get_L0_bases(L0_bases, L_bases);
        get_L10_bases(L10_bases, L_bases);

        // get q of degree approximately sqrt(n)
        LinearizedPoly(q, L00_bases);

        // get the bases for L_1
        get_L1_bases(L1_bases, L10_bases, q);

        // get all elements in L'_1, L'_0 and L_1 for later use
        get_span(L10_span, L10_bases);
        get_span(L0_span, L0_bases);
        get_span(L1_span, L1_bases);

        // get the bivariate polynomial f
        create_bivariate(f, poly, q, L10_span); // given in Listing 4

        // evaluate the bivariate polynomial f on S u T
        for (i=0; i<L10_span.length(); i++) {
            beta0 = L10_span[i]; // for each beta in L'_1

            // get the bases for L_beta
            get_Lbeta_bases(Lbeta_bases, beta0, L_bases);

            // Find f(a,q(beta)) for all a in L_beta, i.e. the q(beta)-row of S u T
            roweval.clear();
            eval_poly(roweval, f[i], Lbeta_bases);
            bioracle.insert(roweval, EvalLinearizedPoly(q, beta0));
        }

        // Construct the auxiliary proofs Pi
        vector<poly_oracle*> proofs(L10_span.length() + L0_span.length());

        // Construct the row proofs for all beta in L'_1, indexed by q(beta)
        for (i=0; i<L10_span.length(); i++) {
            proofs.at(i) = new poly_oracle;
            get_Lbeta_bases(Lbeta_bases, L10_span[i], L_bases);

            // proof that the q(beta)-row of f is in RS(F,L_beta,|L_beta|/8)
            ReedSolomon_PCPP(*proofs.at(i), f[i], Lbeta_bases);
        }

        // Construct the column proofs for all a in L'_0
        for (i=0; i<L0_span.length(); i++) {
            coleval.clear();
            proofs.at(i+L1_span.length()) = new poly_oracle;
            for (j=0; j<L1_span.length(); j++) {
                coleval.insert(L1_span[j], bioracle.query(L0_span[i], L1_span[j]));
            }
            interpolate_poly(fcol, coleval, L1_bases);

            // proof that the a-column of f is in RS(F,L_1,|L_1|/8)
            ReedSolomon_PCPP(*proofs.at(i+L1_span.length()), fcol, L1_bases);
        }
        pcpp.eval = bioracle;
        pcpp.proof = proofs;
    }
    pcpp.next = 0;
    return;
}


³ Notice that a verifier does not need a separate evaluation table for $\hat{p}$ because it can simply use the provided evaluation
table for $p$; separately evaluating $\hat{p}$ and $f$ is crucial to proving the soundness of the verifier.


2.2.1 Running Time of the Prover 


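Let $T(n)$ denote the running time of the algorithm shown in Listing 3 for $n = |L|$. Let $T_f(n)$ denote the
time required to find the bivariate polynomial $f$. Then from inspection of the algorithm, it can be seen that
asymptotically:

\[
T(n) = \begin{cases}
T_f(n) + O\big(2^{\lceil k/2 \rceil} \cdot (8 \cdot 2^{\lfloor k/2 \rfloor} \log^2(8 \cdot 2^{\lfloor k/2 \rfloor}))\big) + 2^{\lceil k/2 \rceil} \cdot T(8 \cdot 2^{\lfloor k/2 \rfloor}) + 4 \cdot 2^{\lfloor k/2 \rfloor} \cdot T(2^{\lceil k/2 \rceil}) & \text{if } k > 6 \\
0 & \text{if } k \le 6
\end{cases}
\]
\[
\phantom{T(n)} = \begin{cases}
T_f(n) + O(n \log^2 n) + 2^{\lceil k/2 \rceil} \cdot T(8 \cdot 2^{\lfloor k/2 \rfloor}) + 4 \cdot 2^{\lfloor k/2 \rfloor} \cdot T(2^{\lceil k/2 \rceil}) & \text{if } k > 6 \\
0 & \text{if } k \le 6
\end{cases}
\]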

where $k = \log(n)$. So, we need to find $T_f(n)$ in order to solve the recurrence above for $T(n)$. Recall that $f$ is
the restriction to $S$ of a bivariate polynomial $Q$ which satisfies the relationship $Q(x, q(x)) = p(x)$ on $T$ and
which has $\deg_x(Q) < \deg(q)$ and $\deg_y(Q) \le \lfloor \deg(p)/\deg(q) \rfloor$. Also, notice from Listing 3 that we represent
a bivariate polynomial over $x$ and $y$ as a sequence of univariate polynomials over $x$, one for each value of $y$ in
the domain. The algorithm that we use for calculating $Q$ uses division over the ring of bivariate polynomials.
Note that if we fix a lexicographic ordering on terms with $x > y$, then dividing $p(x)$ by $q(x) - y$, we obtain

\[
p(x) = Q_0(x, y) \cdot (q(x) - y) + Q(x, y)
\]

where $Q_0$ is the quotient. It can be easily checked that the remainder $Q(x, y)$ has the requisite properties. For our representation, we
want to evaluate $Q(x, \hat\beta)$ for all $\hat\beta \in L_1$. The following lemma asserts that $Q(x, \hat\beta)$ is the remainder after the
univariate division of $p(x)$ by $q(x) - \hat\beta$.


Lemma 4: Let $F[x, y]$ be the ring of bivariate polynomials with the lexicographic ordering $x > y$ on
terms. Suppose $f \in F[x]$ and $g \in F[x, y]$. Also, $g(x, y) = m(x) + n(y)$ where $m \in F[x]$ and $n \in F[y]$.
Let $h(x, y)$ be the remainder after dividing $f(x)$ by $g(x, y)$. Then, for any $a \in F$, if $h_a(x)$ is the
remainder after the univariate division of $f(x)$ by $g(x, a)$, then $h_a(x) = h(x, a)$.


Proof: Fix $a \in F$. Let $f(x) = s(x, y) g(x, y) + h(x, y)$ and $f(x) = s_a(x) g(x, a) + h_a(x)$. We
have $\deg_x(h) < \deg_x(g)$ and $\deg(h_a) < \deg(g(x, a)) = \deg_x(g)$. Now, $s(x, a) g(x, a) + h(x, a) =
s_a(x) g(x, a) + h_a(x)$, or

\[
h(x, a) - h_a(x) = g(x, a) \, (s_a(x) - s(x, a))
\]

If $(s_a(x) - s(x, a))$ is not zero, then the degree of the right hand side is at least $\deg(g(x, a)) = \deg_x(g)$
and so must be the degree of the left hand side, contradicting what we said before. So, $h(x, a) - h_a(x) = 0$.


Thus, we can represent $Q$ by performing one univariate division for each $\hat\beta = q(\beta) \in L_1$. This algorithm in
C++ code is given in Listing 4.


Listing 4: Construction of the bivariate polynomial Q


void create_bivariate(vec_GF2EX& bivs, const GF2EX& P,
                      const GF2EX& q, const vec_GF2E& L10_span) {
    GF2EX qp;
    GF2E tmp;

    bivs.SetLength(L10_span.length());

    for (long i=0; i<L10_span.length(); i++) { // For each beta in L'_1
        tmp = EvalLinearizedPoly(q, L10_span[i]);
        qp = q - GF2EX(0,tmp);   // q(x) - q(beta)
        bivs[i] = P % qp;        // remainder is the q(beta)-row of Q
    }
}
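The row-by-row construction justified by Lemma 4 can be tried out without NTL. The sketch below is an assumption-laden illustration: it works in GF(2^8) with the modulus x^8 + x^4 + x^3 + x + 1, uses schoolbook division in place of NTL's `%`, and all helper names (vanishing_poly, poly_mod, row_of_Q, row_matches) are hypothetical. It checks that the remainder of $p$ modulo $q(x) - q(\beta)$ has degree less than $\deg(q)$ and agrees with $p$ on the coset $\beta + L_0$.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Elem = std::uint8_t;           // element of GF(2^8), an assumed toy field
using Poly = std::vector<Elem>;      // coefficients, lowest degree first

// Multiplication modulo x^8 + x^4 + x^3 + x + 1 (assumed irreducible polynomial).
Elem gf_mul(Elem a, Elem b) {
    Elem r = 0;
    while (b) {
        if (b & 1) r ^= a;
        bool carry = (a & 0x80) != 0;
        a = static_cast<Elem>(a << 1);
        if (carry) a ^= 0x1B;
        b >>= 1;
    }
    return r;
}

Elem poly_eval(const Poly& p, Elem x) {   // Horner's rule
    Elem r = 0;
    for (std::size_t i = p.size(); i-- > 0;)
        r = static_cast<Elem>(gf_mul(r, x) ^ p[i]);
    return r;
}

std::vector<Elem> span_of(const std::vector<Elem>& basis) {
    std::vector<Elem> out{0};
    for (Elem b : basis) {
        std::size_t m = out.size();
        for (std::size_t i = 0; i < m; ++i)
            out.push_back(static_cast<Elem>(out[i] ^ b));
    }
    return out;
}

// q(x) = product over alpha in points of (x - alpha); minus is XOR here.
Poly vanishing_poly(const std::vector<Elem>& points) {
    Poly q{1};
    for (Elem a : points) {
        Poly next(q.size() + 1, 0);
        for (std::size_t j = 0; j < q.size(); ++j) {
            next[j + 1] ^= q[j];
            next[j] ^= gf_mul(a, q[j]);
        }
        q = next;
    }
    return q;
}

// Remainder of p modulo a monic polynomial m (schoolbook long division).
Poly poly_mod(Poly p, const Poly& m) {
    while (p.size() >= m.size()) {
        Elem lead = p.back();
        std::size_t shift = p.size() - m.size();
        for (std::size_t j = 0; j < m.size(); ++j)
            p[shift + j] ^= gf_mul(lead, m[j]);
        p.pop_back();                      // leading coefficient is now zero
    }
    return p;
}

// The q(beta)-row of Q: remainder of p(x) modulo q(x) - q(beta).
Poly row_of_Q(const Poly& p, const Poly& q, Elem beta) {
    Poly m = q;
    m[0] ^= poly_eval(q, beta);
    return poly_mod(p, m);
}

// True if the row has degree < deg(q) and agrees with p on beta + L_0.
bool row_matches(const Poly& p, const std::vector<Elem>& L0_basis, Elem beta) {
    std::vector<Elem> L0 = span_of(L0_basis);
    Poly q = vanishing_poly(L0);
    Poly row = row_of_Q(p, q, beta);
    if (row.size() >= q.size()) return false;
    for (Elem a : L0) {
        Elem x = static_cast<Elem>(a ^ beta);
        if (poly_eval(row, x) != poly_eval(p, x)) return false;
    }
    return true;
}
```

With $L_0 = \mathrm{span}(1, 2)$ and $\beta$ ranging over $\mathrm{span}(4, 8)$, each call to row_matches corresponds to one iteration of the loop in Listing 4.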


Univariate division of two degree $d$ polynomials can be reduced to multiplication of two degree $d$
polynomials using the Sieveking-Kung method (see [vzGG99]); thus, univariate polynomial division can be achieved
in $O(d \log d)$ field operations. This is how polynomial division in NTL is implemented. Since we are
performing $\sqrt{n}$ divisions of an $n/8$-degree polynomial, we have for this algorithm, $T_f(n) = O(n^{3/2} \log n)$.

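Then, we can rewrite the recurrence for $T(n)$ as:

\[
T(n) = \begin{cases}
O(n^{3/2} \log n) + 2^{\lceil k/2 \rceil} \cdot T(8 \cdot 2^{\lfloor k/2 \rfloor}) + 4 \cdot 2^{\lfloor k/2 \rfloor} \cdot T(2^{\lceil k/2 \rceil}) & \text{if } k > 6 \\
0 & \text{if } k \le 6
\end{cases}
\qquad (1)
\]

again with $k = \log(n)$.
Lemma 5: $T(n) = O(n^{3/2} \log n)$.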


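Proof: We prove by induction that $T(n) \le c \cdot n^{3/2} (\log n - 6)$ for an appropriate choice of $c$ and for
sufficiently large $n$. For $k > 6$, $8 \cdot 2^{\lfloor k/2 \rfloor} < 2^k$ and hence, we start by assuming that the bound to be
proven holds for the recursive calls in (1). For large enough $n$, there exists a constant $d$ such that:

\begin{align*}
T(n) &\le d \cdot n^{3/2} \log n + 2^{\lceil k/2 \rceil} \cdot T(8 \cdot 2^{\lfloor k/2 \rfloor}) + 4 \cdot 2^{\lfloor k/2 \rfloor} \cdot T(2^{\lceil k/2 \rceil}) \\
&\le d \cdot k \cdot n^{3/2} + c \cdot 2^{\lceil k/2 \rceil} 2^{9/2} 2^{3 \lfloor k/2 \rfloor / 2} (3 + \lfloor k/2 \rfloor - 6) + 4c \cdot 2^{\lfloor k/2 \rfloor} 2^{3 \lceil k/2 \rceil / 2} (\lceil k/2 \rceil - 6) \\
&< d \cdot k \cdot n^{3/2} + \frac{c}{2} n^{3/2} (\lfloor k/2 \rfloor - 3) + \frac{c}{2} n^{3/2} (\lceil k/2 \rceil - 6) \\
&= n^{3/2} \Big( dk + \frac{c}{2} (k - 9) \Big) \\
&\le c \cdot n^{3/2} (k - 6)
\end{align*}

The first inequality follows from the defining recurrence relation for $T(n)$ in (1). The second inequality
follows from the inductive hypothesis. The third inequality follows from observing that for $k > 22$,
$\frac{9}{2} + \frac{1}{2} \lfloor k/2 \rfloor < \frac{k}{2} - 1$ and $2 + \frac{1}{2} \lceil k/2 \rceil < \frac{k}{2} - 1$. The fourth equality is algebra. The fifth inequality follows
from having an appropriately large $c$. As for the base case of the induction, we choose a $c$ so that
$c \cdot n^{3/2} (\log n - 6)$ is larger than $T(2^{14})$ and $T(2^{12})$ because these are the values that $T(2^{23})$ depends
on.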
So, finding the bivariate polynomial f is the main bottleneck in constructing the PCPP and leads to the
rather large running time of the prover in Lemma 5. It remains an open question whether the running time 
of the prover for this PCPP system can be improved. 


2.2.2 Proof Size 


As mentioned in the introduction, the size of the PCPP is an important parameter in many applications of 
the theory. Having a nearly linear proof size has consequences for the construction of locally testable codes, 
for example. We will show that our PCPPs indeed have this property. 

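Looking at Listing 3, the proof size⁴, $S(n)$, can be recursively characterized as:

\[
S(n) = \begin{cases}
2^{\lceil k/2 \rceil} \cdot (8 \cdot 2^{\lfloor k/2 \rfloor}) + 2^{\lceil k/2 \rceil} \cdot S(8 \cdot 2^{\lfloor k/2 \rfloor}) + 4 \cdot 2^{\lfloor k/2 \rfloor} \cdot S(2^{\lceil k/2 \rceil}) & \text{if } k > 6 \\
0 & \text{if } k \le 6
\end{cases}
\]
\[
\phantom{S(n)} = \begin{cases}
8n + 2^{\lceil k/2 \rceil} \cdot S(8 \cdot 2^{\lfloor k/2 \rfloor}) + 4 \cdot 2^{\lfloor k/2 \rfloor} \cdot S(2^{\lceil k/2 \rceil}) & \text{if } k > 6 \\
0 & \text{if } k \le 6
\end{cases}
\qquad (2)
\]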


Lemma 6: $S(n) = O(n \log^4 n)$


Proof: We prove by induction that $S(n) \le c \cdot n \log^4 n$ for an appropriate value of $c$. We will assume
that this bound holds for the recursive calls in (2). Then, we have:

\begin{align*}
S(n) &\le 8n + 2^{\lceil k/2 \rceil} \cdot c \cdot 8 \cdot 2^{\lfloor k/2 \rfloor} (\lfloor k/2 \rfloor + 3)^4 + 4 \cdot 2^{\lfloor k/2 \rfloor} \cdot c \cdot 2^{\lceil k/2 \rceil} \lceil k/2 \rceil^4 \\
&= 8n + 8cn (\lfloor k/2 \rfloor + 3)^4 + 4cn \lceil k/2 \rceil^4 \\
&\le 8n + 12cn (\lfloor k/2 \rfloor + 3)^4 \\
&\le cn \log^4 n
\end{align*}

The first inequality is the inductive hypothesis. The second equality is from simplification. The third
inequality follows from $\lceil k/2 \rceil \le \lfloor k/2 \rfloor + 1$. The fourth inequality holds for large values of $k$ (since
$12 < 2^4$). For the base case of the induction, take $c$ to be large enough so that the bound holds for the
values of $n$ where the fourth inequality is true.
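Recurrence (2) is easy to evaluate directly, which gives a quick sanity check on Lemma 6. The following is a small sketch with hypothetical function names (proof_size, ratio); it counts field elements, as in the text, and is independent of the actual prover code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// S(2^k) from recurrence (2), counted in field elements.
// Intended for moderate k (say k <= 30) so that the result fits in 64 bits.
std::uint64_t proof_size(int k) {
    if (k <= 6) return 0;
    int lo = k / 2;                      // floor(k/2)
    int hi = k - lo;                     // ceil(k/2)
    std::uint64_t rows = 1ull << hi;     // number of row proofs, |L'_1|
    std::uint64_t cols = 4ull << lo;     // number of column proofs, |L'_0|
    return 8 * (1ull << k)               // evaluation table: 8n entries
         + rows * proof_size(lo + 3)     // rows live on spaces of size 8 * 2^floor(k/2)
         + cols * proof_size(hi);        // columns live on spaces of size 2^ceil(k/2)
}

// S(n) / (n log^4 n) for n = 2^k, to compare against the bound of Lemma 6.
double ratio(int k) {
    return static_cast<double>(proof_size(k)) / (std::ldexp(1.0, k) * std::pow(k, 4));
}
```

Tabulating ratio(k) over a range of k shows how the constant hidden in the $O(n \log^4 n)$ bound behaves for the recurrence itself.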


Although the proof to the lemma above treats the bounds loosely, the $O(n \log^4 n)$ bound to the solution
of the recursion in (2) is pretty tight. In fact, we find from running our program that S(n) = qn log* n is a 
good bound for the proof size. 


2.3 The Verifier


The verifier for the Reed-Solomon PCPP uses the bivariate polynomial test analyzed in [PS94] to check that 
the provided input is indeed close to a Reed-Solomon codeword. All this is done by querying only a constant 
number of field elements! The test made by the verifier is described in [BSS04] as follows: 


Definition 4 ([BSS04], Definition 5) The verifier for proximity to $RS(GF(2^\ell), L, d = |L|/8)$ receives
as input the parameters $GF(2^\ell)$, a basis $(b_1, \ldots, b_k)$ for $L$ and degree parameter $d = |L|/8$. It has oracle
access to a purported codeword $p : L \to GF(2^\ell)$ and its purported proof $\pi = \{f, \Pi\}$ and is denoted
$V_{RS}^{(p, \pi)}(GF(2^\ell), L, d)$. If $|L| \le 64$ (in which case $\pi = \emptyset$), the verifier reads $p$ in entirety and accepts iff
$p \in RS(GF(2^\ell), L, |L|/8)$. Otherwise, it computes $\lfloor k/2 \rfloor$ and performs one of the following two tests
with probability half each.


Row-Test Pick random $\beta \in L'_1$, set $\hat\beta = q(\beta)$, compute a basis for $L_\beta$ and recursively run
$V_{RS}^{(\hat{f}|^{\leftrightarrow}_{\hat\beta}, \, \pi^{\leftrightarrow}_{\hat\beta})}(GF(2^\ell), L_\beta, |L_\beta|/8)$.


Col-Test Pick $\alpha \in L'_0$ at random, compute a basis for $L_1$ and then recursively run
$V_{RS}^{(\hat{f}|^{\updownarrow}_{\alpha}, \, \pi^{\updownarrow}_{\alpha})}(GF(2^\ell), L_1, |L_1|/8)$.


⁴ We count the number of field elements in the proof. Counting the number of bits leads to another factor of $\log|F|$.


In the above definition, $\hat{f}$ is the bivariate function that agrees with the evaluation table of $f$ on $S$ and
with $\hat{p}$ on $T$. Recall from Section 2.2 that $\hat{p}$ is a partial bivariate function with the partial domain $T$,
defined to be $\hat{p}(x, q(x)) = p(x)$. So, at the top level, when a row or column of $\hat{f}$ is selected, some of its values
can be retrieved from querying the bivariate polynomial evaluation table (for $f$) provided in the PCPP
while for others, the input string (the evaluation for $p$) must be queried. As the verifier gets deeper into the
recursion tree, determining where to look in the PCPP for an evaluation of $\hat{f}$ requires looking back at the
decision tree of choosing row-tests or column-tests and determining at each level if the needed evaluation
of $\hat{f}$ is contained in the bivariate evaluation table at that level. Instead of complicating the implementation
of the verifier, it is easier to restructure the proof as an oracle program that automatically determines the
correct place to look in itself for an evaluation of $\hat{f}$. Such a program implemented in C++ is shown in
Listing 5.


Listing 5: Implementation of a Proof Oracle 


enum Level {TOP, ROW, COL}; 

struct verifier_oracle { 
    // Evaluation of f on S 
    const biv_eval_table* table; 

    // Looking at row or column <header> of f 
    GF2E header; 

    // Pointer to the proof oracle that should be queried for 
    // evaluations on T 
    const verifier_oracle* parent; 

    // If this is the top level, evaluation table of the univariate 
    // polynomial p 
    const eval_table* orig_poly; 

    // The level: top, a row, or a column 
    Level lev; 

    // The linearized polynomial q 
    GF2EX q; 

    // Constructor for the top level 
    verifier_oracle(const eval_table* orig) { 
        orig_poly = orig; 
        lev = TOP; 
        parent = 0; 
        table = 0; 
    } 

    // Constructor if this is the row or column projection 
    verifier_oracle(const verifier_oracle* par, const biv_eval_table* tab, 
                    Level roworcol, const GF2E& val, const GF2EX& qp) { 
        parent = par; 
        table = tab; 
        lev = roworcol; 
        header = val; 
        q = qp; 
        orig_poly = 0; 
    } 

    // Recursive query 
    GF2E query(const GF2E& ask) const { 
        if (lev == TOP) 
            return orig_poly->query(ask); 
        else if (lev == ROW) { 
            // The point (ask, header) lies on T exactly when q(ask) == header 
            if (EvalLinearizedPoly(q, ask) != header) { 
                return table->query(ask, header); 
            } 
            else { 
                return parent->query(ask); 
            } 
        } 
        else { 
            // The point (header, ask) lies on T exactly when q(header) == ask 
            if (EvalLinearizedPoly(q, header) != ask) { 
                return table->query(header, ask); 
            } 
            else { 
                return parent->query(header); 
            } 
        } 
    } 
}; 
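
The lookup rule of Listing 5 can be tried out without NTL. The following is a minimal sketch under stated assumptions: field elements are replaced by small integers, q is an arbitrary callable, and the names `Elem`, `UniTable`, `BivTable` and `ToyOracle` are hypothetical stand-ins that do not appear in the implementation described here. It only illustrates how a query either hits the bivariate table at the current level or is forwarded to the parent oracle.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Toy stand-ins (assumptions): small integers instead of NTL's GF2E,
// and a plain callable instead of a linearized polynomial in GF2EX.
using Elem = std::uint32_t;
using UniTable = std::map<Elem, Elem>;                   // x -> p(x)
using BivTable = std::map<std::pair<Elem, Elem>, Elem>;  // (x, y) -> f(x, y)

enum class Level { Top, Row, Col };

struct ToyOracle {
  Level lev;
  const ToyOracle* parent;      // oracle one level up (null at the top)
  const UniTable* poly;         // used only at the top level
  const BivTable* table;        // used only below the top level
  Elem header;                  // fixed row (y) or column (x) coordinate
  std::function<Elem(Elem)> q;  // the map q at this level

  Elem query(Elem ask) const {
    if (lev == Level::Top) return poly->at(ask);
    if (lev == Level::Row) {
      // Point (ask, header): it lies on T exactly when q(ask) == header.
      if (q(ask) != header) return table->at({ask, header});
      return parent->query(ask);
    }
    // Point (header, ask): it lies on T exactly when q(header) == ask.
    if (q(header) != ask) return table->at({header, ask});
    return parent->query(header);
  }
};
```

In this toy model, a row oracle with header h answers a query x from its own table unless q(x) = h, in which case the value comes from the parent. Chaining several such oracles reproduces the walk back up the decision tree described above.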


Using this proof oracle structure, the implementation of the verifier is simple and direct. It is shown 
below. 


Listing 6: Implementation of the PCPP verifier of [BSS04] 


// Forward declaration of the recursive helper defined below 
bool verify(const vec_GF2E& L_bases, const verifier_oracle& oracle, 
            const poly_oracle& proof); 

/** Verify if indeed <proof> is a valid PCPP that shows that <poly> 
 * is the evaluation table of a polynomial of degree less than |L|/8. 
 */ 
bool verify_proof(const vec_GF2E& L_bases, const eval_table& poly, 
                  const poly_oracle& proof) { 
    // The root oracle only has to outlive the recursive call 
    verifier_oracle root(&poly); 
    return verify(L_bases, root, proof); 
} 

/** A helper procedure for the above 
 */ 
bool verify(const vec_GF2E& L_bases, const verifier_oracle& oracle, 
            const poly_oracle& proof) { 
    long k = L_bases.length(), index, i; 
    vec_GF2E L00_bases, L10_bases, L0_bases, L1_bases, Lbeta_bases, L_span; 
    poly_oracle *rowproof, *colproof; 
    GF2E choice, qchoice; 
    GF2EX q, poly; 
    int rand; 

    // if k<7, simply read in all of the input, interpolate a 
    // polynomial, and check its degree 
    if (k < 7) { 
        get_span(L_span, L_bases); 
        eval_table polyvals; 

        // maximum of 64 queries here 
        for (i = 0; i < L_span.length(); i++) { 
            polyvals.insert(L_span[i], oracle.query(L_span[i])); 
        } 
        interpolate_poly(poly, polyvals, L_bases); 

        // accept iff the degree is less than |L|/8 = 2^(k-3) 
        return deg(poly) < power_long(2, k-3); 
    } 
    else { 
        // get the bases for L00, L0, L10 and L1 
        get_L00_bases(L00_bases, L_bases); 
        get_L0_bases(L0_bases, L_bases); 
        get_L10_bases(L10_bases, L_bases); 

        LinearizedPoly(q, L00_bases); 
        get_L1_bases(L1_bases, L10_bases, q); 

        // flip a coin 
        if (getRandomBit() == 1) { // check row 
            index = 0; 
            // choose random element β in the span of L10_bases 
            for (i = 0; i < L10_bases.length(); i++) { 
                rand = getRandomBit(); 
                index = index + rand * power_long(2, i); 
                choice += rand * L10_bases[L10_bases.length()-i-1]; 
            } 
            qchoice = EvalLinearizedPoly(q, choice); 

            // get the subproof for the chosen row 
            rowproof = proof.proof[index]; 

            verifier_oracle next(&oracle, &(proof.eval), ROW, qchoice, q); 
            get_Lbeta_bases(Lbeta_bases, choice, L_bases); 
            // recurse 
            return verify(Lbeta_bases, next, *rowproof); 
        } 
        else { // check column 
            index = 0; 
            // choose random element α in the span of L0_bases 
            for (i = 0; i < L0_bases.length(); i++) { 
                rand = getRandomBit(); 
                index = index + rand * power_long(2, i); 
                choice += rand * L0_bases[L0_bases.length()-i-1]; 
            } 

            // get the subproof for the chosen column; column subproofs 
            // are stored after the 2^|L10_bases| row subproofs 
            colproof = proof.proof[index + power_long(2, L10_bases.length())]; 

            verifier_oracle next(&oracle, &(proof.eval), COL, choice, q); 
            // recurse 
            return verify(L1_bases, next, *colproof); 
        } 
    } 
} 
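
The coin-flipping loops in Listing 6 pick a uniformly random element of a subspace and, at the same time, the index of the matching subproof. The following is a minimal sketch of that step under stated assumptions: field elements are modeled as bit vectors in a `uint64_t` so that addition is XOR, the basis has fewer than 64 vectors, and `random_span_element` and `BitSource` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Toy model (assumption): elements of GF(2^m), m <= 64, are bit vectors
// stored in a uint64_t, so field addition is XOR. NTL's GF2E plays this
// role in the listings above.
using Elem = std::uint64_t;

// Draws one coin per basis vector. Coin i contributes 2^i to the index
// and decides whether basis[n-1-i] is added to the element, mirroring
// the loops in Listing 6. `next_bit` is any callable returning 0 or 1.
// Assumes basis.size() < 64 so that the shift below is well defined.
template <typename BitSource>
std::pair<std::size_t, Elem> random_span_element(
    const std::vector<Elem>& basis, BitSource&& next_bit) {
  std::size_t index = 0;
  Elem choice = 0;
  const std::size_t n = basis.size();
  for (std::size_t i = 0; i < n; ++i) {
    const int bit = next_bit() & 1;
    if (bit) {
      index |= std::size_t{1} << i;
      choice ^= basis[n - 1 - i];
    }
  }
  return {index, choice};
}
```

Because the basis vectors are linearly independent, distinct coin sequences give distinct elements, so n coins select each of the 2^n elements of the span with equal probability and the index identifies the element uniquely.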


The query complexity of the verifier is immediate. Queries are made only in the base case k < 7, where 
|L| ≤ 2^6 = 64, so the verifier queries at most 64 field elements and, hence, at most 64 log|F| bits. Next, 
we look at some other complexity parameters associated with the PCPP verifier. 


2.3.1 Randomness Complexity 


In [BSS05], it is shown that the randomness complexity is $r(k) \le k + c \cdot \log k$ for a constant c. Here, 
we give a tighter bound for r(k). 

First of all, note that the exact number of coins flipped by the verifier depends on its decision tree of 
choosing between row-tests and column-tests; this is so because $|L_\beta|$ and $|L_1|$ are different for all $\beta$. We 
want to determine the maximum number of coins that can be flipped by the verifier, i.e., an upper bound 
on r(k). Thus, looking at the definition of the verifier, we can write: 


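$$ r(k) = 1 + \max\bigl\{ \dim L_{10} + r(\dim L_\beta),\; \dim L_0 + r(\dim L_1) \bigr\} \quad (k \ge 7), \qquad r(k) = 0 \quad (k < 7), $$ 

where $L_{10}$, $L_0$, $L_\beta$ and $L_1$ are the spaces spanned by L10_bases, L0_bases, Lbeta_bases and 
L1_bases in Listing 6. 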


Lemma 7: $r(k) \le k + 4\lfloor \log(k-6) \rfloor - 1$ for $k \ge 7$. 

Proof: Can be verified immediately through a straightforward induction on k. 

The randomness complexity also gives us a way to bound the proof size, because $S(n) \le 2^{r(n)} q(n)$, 
where S(n) is the proof size and q(n) is the query complexity. With $n = 2^k$, Lemma 7 gives 
$2^{r} \le 2^{k-1}(k-6)^4 < n \log^4 n$, and q(n) ≤ 64. So, once again, $S(n) = O(n \log^4 n)$. 
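
The bound of Lemma 7 can be checked numerically for small k. The sketch below rests on assumptions that are not restated in this section: a row test is taken to draw ⌈k/2⌉ coins and recurse on dimension ⌊k/2⌋ + 3, a column test to draw ⌊k/2⌋ + 2 coins and recurse on dimension ⌈k/2⌉, and the bound is read as r(k) ≤ k + 4⌊log2(k − 6)⌋ − 1. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Maximum number of coins under the ASSUMED dimensions:
//   row test:    ceil(k/2) coins, then recurse on floor(k/2) + 3
//   column test: floor(k/2) + 2 coins, then recurse on ceil(k/2)
long max_coins(long k) {
  if (k < 7) return 0;  // base case reads the whole input, no coins
  const long lo = k / 2;      // floor(k/2)
  const long hi = k - k / 2;  // ceil(k/2)
  const long row = hi + max_coins(lo + 3);
  const long col = lo + 2 + max_coins(hi);
  return 1 + std::max(row, col);  // one extra coin picks row vs. column
}

// floor(log2(x)) for x >= 1.
long floor_log2(long x) {
  long e = 0;
  while (x > 1) {
    x /= 2;
    ++e;
  }
  return e;
}

// Right-hand side of Lemma 7 as read above, for k >= 7.
long lemma7_bound(long k) { return k + 4 * floor_log2(k - 6) - 1; }
```

Comparing `max_coins(k)` with `lemma7_bound(k)` over a range of k gives a quick sanity check of the induction; under different dimension choices the recursion would have to be adjusted accordingly.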

2.3.2 Running Time of the Verifier 

Let $t_V(k)$ denote the running time of the verifier. Then, we have that: 


Lemma 8: $t_V(k) = O(k^3)$. 

Proof: From inspecting the algorithm given in Proposition 8 of [BSS04], q(x), the linearized polynomial, 
has k terms and can be computed in time $O(k^3)$. It can be evaluated in time $O(k^2)$. Thus, 
computing the basis for $L_1$ takes time $O(k^3)$, and similarly for computing the basis for $L_\beta$ given $\beta$. 
Therefore, we can write the following recursion: 


$$ t_V(k) = O(k^3) + \max\bigl( t_V(\lfloor k/2 \rfloor + 3),\; t_V(\lceil k/2 \rceil) \bigr) = O(k^3) + t_V(\lfloor k/2 \rfloor + 3), $$ 


since $t_V$ is monotonically increasing. A simple induction shows that $t_V(k) = O(k^3)$. 
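
The induction can also be seen by unrolling the recursion. The following sketch writes the cost of one level as at most $a\,k^{c}$ for constants $a > 0$ and $c \ge 1$; these symbols, the level sizes $k_i$, and the depth $d$ are introduced here only for illustration. It uses the inequality $(x+y)^c \le 2^{c-1}(x^c + y^c)$ and the fact that the base case $k < 7$ costs $O(1)$.

```latex
\begin{align*}
k_0 &= k, \qquad k_{i+1} = \lfloor k_i/2 \rfloor + 3
  \;\Longrightarrow\; k_i - 6 \le \frac{k-6}{2^{i}}, \\
d &\le \lfloor \log_2 (k-6) \rfloor + 1
  \quad \text{(number of levels until } k_d < 7\text{)}, \\
t_V(k) &\le \sum_{i=0}^{d-1} a\,k_i^{c} + O(1)
  \le a\,2^{c-1} \sum_{i=0}^{d-1} \left( \frac{k^{c}}{2^{ic}} + 6^{c} \right) + O(1) \\
  &\le a\,2^{c-1} \left( 2k^{c} + 6^{c} d \right) + O(1) = O(k^{c}).
\end{align*}
```

The level sizes shrink geometrically toward the fixed point 6 of the map $k \mapsto k/2 + 3$, so the top level dominates the total cost and the exponent of the per-level cost carries over to $t_V(k)$.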


3 Conclusion 


Our tight bounds on the complexity parameters related to Reed-Solomon PCPPs show that it is indeed 
feasible in practice to create PCPPs as a semantic analog to error-correcting codes. The question of improving 
the time performance of the prover remains open. 